An information-theoretic approach to automatic query expansion
Similar resources
Automatic Word Categorization: An Information-theoretic Approach
This paper presents a novel approach to the automatic categorization of words from raw data. We count occurrences of word pairs in text and use a hierarchical clustering technique on this frequency data to obtain a classification of words into linguistic categories. As a distance criterion in the clustering process, we use the loss of mutual information caused by combining two clusters into a s...
An Information-Theoretic Approach to Automatic Evaluation of Summaries
Until recently there were no common, convenient, and repeatable evaluation methods that could be easily applied to support fast turn-around development of automatic text summarization systems. In this paper, we introduce an information-theoretic approach to automatic evaluation of summaries based on the Jensen-Shannon divergence of distributions between an automatic summary and a set of reference...
Information Term Selection for Automatic Query Expansion
Techniques for query expansion from top retrieved documents have been recently used by many groups at TREC, often on purely empirical grounds. In this paper we present a novel method for ranking and weighting expansion terms. The method is based on the concept of relative entropy, or Kullback-Leibler distance, developed in Information Theory, from which we derive a computationally simple and th...
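The abstract above describes ranking expansion terms by relative entropy but is truncated before it states the formula. The following is only a minimal sketch of one common way to do this, not necessarily the authors' method: each term in the top retrieved (feedback) documents is scored by its contribution to the Kullback-Leibler divergence between the feedback-document distribution and the whole-collection distribution. The function names `kl_term_scores` and `top_expansion_terms`, the `eps` floor, and the toy data are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def kl_term_scores(feedback_docs, collection_docs, eps=1e-9):
    """Score each feedback term by p_fb(t) * log(p_fb(t) / p_coll(t)).

    Both arguments are lists of tokenized documents (lists of strings).
    `eps` is an assumed floor for terms missing from the collection sample.
    """
    fb_counts = Counter(t for doc in feedback_docs for t in doc)
    coll_counts = Counter(t for doc in collection_docs for t in doc)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())

    scores = {}
    for term, count in fb_counts.items():
        p_fb = count / fb_total
        p_coll = max(coll_counts.get(term, 0) / coll_total, eps)
        # One summand of KL(p_fb || p_coll); positive when the term is
        # relatively more frequent in the feedback documents.
        scores[term] = p_fb * math.log(p_fb / p_coll)
    return scores


def top_expansion_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy data (hypothetical): two feedback documents inside a small collection.
feedback = [["jaguar", "car", "engine"], ["jaguar", "car", "speed"]]
collection = feedback + [["jaguar", "cat", "jungle"], ["cat", "food"], ["the", "car"]]

scores = kl_term_scores(feedback, collection)
expansion = top_expansion_terms(scores, 2)
```

In this toy setup, terms that occur only in the feedback documents rank above terms that are also common elsewhere in the collection, which is the intended behavior of this kind of scoring.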
A Query Expansion Approach to Cross Language Information Retrieval
This paper presents an approach to developing a cross-language information retrieval (CLIR) system whose input is a natural language query written in Filipino and whose target documents are written in English and Filipino. Is it possible to apply existing approaches to CLIR to a Filipino-to-English system? Which linguistic resources are needed by this system?
15.5 A Blind Information Theoretic Approach to Automatic Signal Classification
Recent information theoretic approaches for empirical signal classification have been developed for applications where labeled training data from each of the signal sources is available. These so-called "universal" classifiers have been shown to be asymptotically optimal under very broad statistical conditions on the signals of interest and have recently been successfully applied to problems in...
Journal
Journal title: ACM Transactions on Information Systems
Year: 2001
ISSN: 1046-8188,1558-2868
DOI: 10.1145/366836.366860